Secondary structure motif determination in ncRNA via graph kernel based computational models

Authors

  • Kiran Kumar Telukunta
  • Rolf Backofen
  • Martin Riedmiller
Abstract

Non-coding RNA (ncRNA), a functional molecule that is not translated into protein, has gained considerable importance in bioinformatics, therapeutics, chemoinformatics, and science more broadly. The nucleotide composition of an ncRNA and its structure (the identity of paired and unpaired nucleotides) determine its function. Key analytical tools such as folding, alignment, and clustering algorithms rely on energetic considerations to answer the specific queries for which they are designed. In practice, these algorithms become inaccurate when non-linear effects violate their underlying assumptions, such as energy additivity. To overcome this, one can formulate key parameters as nonlinear functional dependencies that can be learned from known examples (or parts of examples) or from suboptimal RNA structure predictions. Given the importance of structural elements in ncRNA, these methods should ideally work in structured domains, i.e., they should accept graph data structures as input. The methods belong to the family of kernel machines, since this class of algorithms can use heterogeneous features and accept complex instances such as sequences of graphs as input. The aim of the thesis is to develop a computational model capable of identifying subgraphs within the ncRNA folding graph that are characteristic of biological functions, and then to subject these subgraphs to kernel models in order to improve the accuracy of RNA secondary structure prediction.
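As a rough illustration of what "accepting graph data structures as input" to a kernel machine can look like, the sketch below turns a dot-bracket secondary structure into a graph and compares two such graphs with a small Weisfeiler-Lehman subtree kernel. This is not the kernel developed in the thesis, which the abstract does not specify; the Weisfeiler-Lehman kernel is swapped in as a well-known stand-in, and the function names (`dot_bracket_to_graph`, `wl_features`, `wl_kernel`), the node labelling scheme, and the example sequences are all assumptions made for this example.

```python
from collections import Counter


def dot_bracket_to_graph(seq, structure):
    """Build (labels, adjacency) from a sequence and its dot-bracket structure.

    Nodes are nucleotide positions; edges are backbone bonds and base pairs.
    Each node label is the nucleotide plus 'P' (paired) or 'U' (unpaired).
    The labelling scheme is an assumption of this sketch.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    adj = {i: set() for i in range(len(seq))}
    for i in range(len(seq) - 1):  # backbone edges
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    stack = []
    for i, ch in enumerate(structure):  # base-pair edges
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            adj[i].add(j)
            adj[j].add(i)
    if stack:
        raise ValueError("unbalanced structure")
    labels = {
        i: seq[i] + ("P" if structure[i] in "()" else "U")
        for i in range(len(seq))
    }
    return labels, adj


def wl_features(labels, adj, iterations=2):
    """Count node labels over successive Weisfeiler-Lehman relabelling rounds."""
    feats = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        relabelled = {}
        for v in adj:
            neighbours = sorted(current[u] for u in adj[v])
            relabelled[v] = current[v] + "(" + ",".join(neighbours) + ")"
        current = relabelled
        feats.update(current.values())
    return feats


def wl_kernel(g1, g2, iterations=2):
    """Dot product of the two graphs' Weisfeiler-Lehman feature counts."""
    f1 = wl_features(*g1, iterations=iterations)
    f2 = wl_features(*g2, iterations=iterations)
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())


if __name__ == "__main__":
    hairpin = dot_bracket_to_graph("GGGAAACCC", "(((...)))")
    unfolded = dot_bracket_to_graph("GGGAAACCC", ".........")
    print(wl_kernel(hairpin, hairpin), wl_kernel(hairpin, unfolded))
```

A kernel value computed this way could be placed in a Gram matrix and handed to any kernel machine; whether such features capture functionally characteristic subgraphs is exactly the kind of question the thesis sets out to address.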


Similar resources

Rnav: Non-coding Rna Secondary Structure Variation Search via Graph Homomorphism

Non-coding RNA (ncRNA) secondary structural homologs can be detected effectively in genomes with profile-based search methods. However, due to the lack of appropriate ncRNA structural evolution models, it is difficult to accurately detect distant structural homologs, i.e., ncRNA structures with variations caused by evolutionary changes such as the insertion or deletion of a substantial portion ...


A Computational Pipeline for High-Throughput Discovery of cis-Regulatory Noncoding RNA in Prokaryotes

Noncoding RNAs (ncRNAs) are important functional RNAs that do not code for proteins. We present a highly efficient computational pipeline for discovering cis-regulatory ncRNA motifs de novo. The pipeline differs from previous methods in that it is structure-oriented, does not require a multiple-sequence alignment as input, and is capable of detecting RNA motifs with low sequence conservation. W...


ncRNA discovery and functional identification via sequence motifs

Non-coding RNAs play regulatory roles in gene expression via establishing stable joint structures with target mRNAs through complementary sequence motifs. Sequence motifs are also important determinants of the structure of ncRNAs. Here we introduce two computational tools that both exploit differential distributions of short sequence motifs in ncRNAs for the purpose of identifying their loci an...


Journal of Integrative Bioinformatics

Non-coding RNAs (ncRNAs) contain both characteristic secondary-structure and short sequence motifs. However, “complex” ncRNAs (RNA bound to proteins in ribonucleoprotein complexes) can be hard to identify in genomic sequence data. Programs able to search for ncRNAs were previously limited to ncRNA molecules that either align very well or have highly conserved secondary-structure. The RNAmotif p...


Grammar string: a novel ncRNA secondary structure representation

Multiple ncRNA alignment has important applications in homologous ncRNA consensus structure derivation, novel ncRNA identification, and known ncRNA classification. As many ncRNAs’ functions are determined by both their sequences and secondary structures, accurate ncRNA alignment algorithms must maximize both sequence and structural similarity simultaneously, incurring high computational cost. F...




Publication date: 2011